79 research outputs found

    Contrastive Registration for Unsupervised Medical Image Segmentation

    Medical image segmentation is an important task because it is the first step in several diagnostic processes and is therefore indispensable in clinical use. Whilst major success has been reported using supervised techniques, they assume a large and well-representative labelled set. This is a strong assumption in the medical domain, where annotations are expensive, time-consuming, and prone to human bias. To address this problem, unsupervised techniques have been proposed in the literature, yet it is still an open problem due to the difficulty of learning any transformation pattern. In this work, we present a novel optimisation model framed into a new CNN-based contrastive registration architecture for unsupervised medical image segmentation. The core of our approach is to exploit image-level registration together with feature-level contrastive learning to perform registration-based segmentation. Firstly, we propose an architecture to capture the image-to-image transformation pattern via registration for unsupervised medical image segmentation. Secondly, we embed a contrastive learning mechanism into the registration architecture to enhance the discriminating capacity of the network at the feature level. We show that our proposed technique mitigates the major drawbacks of existing unsupervised techniques. We demonstrate, through numerical and visual experiments, that our technique substantially outperforms the current state-of-the-art unsupervised segmentation methods on two major medical image datasets. Comment: 11 pages, 3 figures
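The abstract above does not say which contrastive objective the authors use. As a rough illustration of what a feature-level contrastive term can look like, here is a minimal InfoNCE-style sketch in plain Python. The function names, the cosine similarity, and the temperature value are all assumptions made for this example, not details taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (hypothetical helper).

    The loss is small when the anchor is closer to the positive than to
    every negative, and large otherwise. The temperature is an assumed value.
    """
    pos_logit = cosine_similarity(anchor, positive) / temperature
    logits = [pos_logit] + [
        cosine_similarity(anchor, neg) / temperature for neg in negatives
    ]
    # Numerically stable log-sum-exp over all logits.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum_exp - pos_logit
```

Calling `info_nce_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])` gives a value close to zero, while swapping the positive and the negative gives a large value. This is the behaviour one would expect from a term meant to sharpen feature discrimination.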

    Parsing is All You Need for Accurate Gait Recognition in the Wild

    Binary silhouettes and keypoint-based skeletons have dominated human gait recognition studies for decades since they are easy to extract from video frames. Despite their success in gait recognition for in-the-lab environments, they usually fail in real-world scenarios due to their low information entropy for gait representations. To achieve accurate gait recognition in the wild, this paper presents a novel gait representation, named Gait Parsing Sequence (GPS). GPSs are sequences of fine-grained human segmentation, i.e., human parsing, extracted from video frames, so they have much higher information entropy to encode the shapes and dynamics of fine-grained human parts during walking. Moreover, to effectively explore the capability of the GPS representation, we propose a novel human parsing-based gait recognition framework, named ParsingGait. ParsingGait contains a Convolutional Neural Network (CNN)-based backbone and two lightweight heads. The first head extracts global semantic features from GPSs, while the other one learns mutual information of part-level features through Graph Convolutional Networks to model the detailed dynamics of human walking. Furthermore, due to the lack of suitable datasets, we build the first parsing-based dataset for gait recognition in the wild, named Gait3D-Parsing, by extending the large-scale and challenging Gait3D dataset. Based on Gait3D-Parsing, we comprehensively evaluate our method and existing gait recognition methods. The experimental results show a significant improvement in accuracy brought by the GPS representation and the superiority of ParsingGait. The code and dataset are available at https://gait3d.github.io/gait3d-parsing-hp. Comment: 16 pages, 14 figures, ACM MM 2023 accepted, project page: https://gait3d.github.io/gait3d-parsing-h

    H-DenseUNet for Kidney and Tumor Segmentation from CT Scans

    Automatic kidney tumor segmentation from CT scans is an essential step for computer-aided diagnosis of cancer. In this paper, we present an improved H-DenseUNet for kidney and tumor segmentation. Specifically, we first train the DenseUNet and then fine-tune the network with its 3D counterpart. To further increase performance, we employ both cross-entropy and Dice losses. We evaluate our method on the 2019 MICCAI kidney and tumor segmentation challenge. We split the challenge's training dataset into 200 training cases and 10 validation cases. On the validation set, our method achieves a Dice score of 97.0% for kidney segmentation and 67.2% for tumor segmentation. This model is submitted to the challenge for final performance evaluation on the test dataset.
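The abstract mentions combining cross-entropy and Dice losses but does not say how they are weighted. The sketch below shows one common way to combine them for a binary mask, in plain Python. The equal weighting, the smoothing constant, and the function names are assumptions for illustration only.

```python
import math


def dice_score(pred, target, eps=1e-7):
    """Soft Dice coefficient between predicted probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def combined_loss(pred, target, dice_weight=0.5):
    """Weighted sum of Dice loss and cross-entropy (the weight is an assumed value)."""
    dice_loss = 1.0 - dice_score(pred, target)
    ce_loss = binary_cross_entropy(pred, target)
    return dice_weight * dice_loss + (1.0 - dice_weight) * ce_loss
```

Here `pred` and `target` are flattened lists of per-voxel probabilities and labels. The Dice term directly targets the overlap metric reported in the abstract, while the cross-entropy term supplies a per-voxel gradient signal.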

    Why Deep Surgical Models Fail?: Revisiting Surgical Action Triplet Recognition through the Lens of Robustness

    Surgical action triplet recognition provides a better understanding of the surgical scene. This task is of high relevance as it provides the surgeon with context-aware support and safety. The current go-to strategy for improving performance is the development of new network mechanisms. However, the performance of current state-of-the-art techniques is substantially lower than on other surgical tasks. Why is this happening? This is the question that we address in this work. We present the first study to understand the failure of existing deep learning models through the lens of robustness and explainability. Firstly, we study existing models under weak and strong δ-perturbations via an adversarial optimisation scheme. We then identify the failure modes via feature-based explanations. Our study reveals that the key to improving performance and increasing reliability lies in the core and spurious attributes. Our work opens the door to more trustworthy and reliable deep learning models in surgical science.

    Homeomorphic Image Registration via Conformal-Invariant Hyperelastic Regularisation

    Deformable image registration is a fundamental task in medical image analysis and plays a crucial role in a wide range of clinical applications. Recently, deep learning-based approaches have been widely studied for deformable medical image registration and have achieved promising results. However, existing deep learning image registration techniques do not theoretically guarantee topology-preserving transformations. This is a key property to preserve anatomical structures and achieve plausible transformations that can be used in real clinical settings. We propose a novel framework for deformable image registration. Firstly, we introduce a novel regulariser based on conformal-invariant properties in a nonlinear elasticity setting. Our regulariser enforces the deformation field to be smooth, invertible and orientation-preserving. More importantly, we strictly guarantee topology preservation, yielding a clinically meaningful registration. Secondly, we boost the performance of our regulariser through coordinate MLPs, where one can view the to-be-registered images as continuously differentiable entities. We demonstrate, through numerical and visual experiments, that our framework is able to outperform current techniques for image registration. Comment: 11 pages, 3 figures
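An orientation-preserving, locally invertible deformation has a positive Jacobian determinant everywhere. The sketch below is not the paper's regulariser. It is a simple diagnostic that counts grid cells where a sampled 2D deformation folds, using forward differences on a unit grid. The function names and the grid layout (x along columns, y along rows) are assumptions for this example.

```python
def identity_grid(rows, cols):
    """Identity map sampled on a unit grid: phi(x, y) = (x, y)."""
    phi_x = [[float(j) for j in range(cols)] for _ in range(rows)]
    phi_y = [[float(i) for _ in range(cols)] for i in range(rows)]
    return phi_x, phi_y


def jacobian_determinants(phi_x, phi_y):
    """Forward-difference Jacobian determinant of phi = (phi_x, phi_y).

    phi_x[i][j] and phi_y[i][j] hold the mapped coordinates of the grid
    point at row i (y direction) and column j (x direction).
    """
    rows, cols = len(phi_x), len(phi_x[0])
    dets = []
    for i in range(rows - 1):
        row = []
        for j in range(cols - 1):
            dxx = phi_x[i][j + 1] - phi_x[i][j]  # d(phi_x)/dx
            dxy = phi_x[i + 1][j] - phi_x[i][j]  # d(phi_x)/dy
            dyx = phi_y[i][j + 1] - phi_y[i][j]  # d(phi_y)/dx
            dyy = phi_y[i + 1][j] - phi_y[i][j]  # d(phi_y)/dy
            row.append(dxx * dyy - dxy * dyx)
        dets.append(row)
    return dets


def count_folds(phi_x, phi_y):
    """Number of cells whose Jacobian determinant is non-positive."""
    dets = jacobian_determinants(phi_x, phi_y)
    return sum(1 for row in dets for d in row if d <= 0.0)
```

A fold count of zero on the sampled grid is only a necessary local check. It does not by itself establish the theoretical topology-preservation guarantee the abstract claims for the proposed regulariser.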

    SCOTCH and SODA: A Transformer Video Shadow Detection Framework

    Shadows in videos are difficult to detect because of the large shadow deformation between frames. In this work, we argue that accounting for shadow deformation is essential when designing a video shadow detection method. To this end, we introduce the shadow deformation attention trajectory (SODA), a new type of video self-attention module, specially designed to handle the large shadow deformations in videos. Moreover, we present a new shadow contrastive learning mechanism (SCOTCH) which aims at guiding the network to learn a unified shadow representation from massive positive shadow pairs across different videos. We demonstrate empirically the effectiveness of our two contributions in an ablation study. Furthermore, we show that SCOTCH and SODA significantly outperform existing techniques for video shadow detection. Code is available at the project page: https://lihaoliu-cambridge.github.io/scotch_and_soda/ Comment: Accepted to CVPR 202

    MammoDG: Generalisable Deep Learning Breaks the Limits of Cross-Domain Multi-Center Breast Cancer Screening

    Breast cancer is a major cause of cancer death among women, emphasising the importance of early detection for improved treatment outcomes and quality of life. Mammography, the primary diagnostic imaging test, poses challenges due to the high variability and patterns in mammograms. Double reading of mammograms is recommended in many screening programs to improve diagnostic accuracy but increases radiologists' workload. Researchers are exploring machine learning models to support expert decision-making. Stand-alone models have shown comparable or superior performance to radiologists, but some studies note decreased sensitivity across multiple datasets, indicating the need for models with high generalisation and robustness. This work devises MammoDG, a novel deep-learning framework for generalisable and reliable analysis of cross-domain multi-center mammography data. MammoDG leverages multi-view mammograms and a novel contrastive mechanism to enhance generalisation capabilities. Extensive validation demonstrates MammoDG's superiority, highlighting the critical importance of domain generalisation for trustworthy mammography analysis under imaging protocol variations.

    Efficient Water-Splitting Device Based on a Bismuth Vanadate Photoanode and Thin-Film Silicon Solar Cells

    A hybrid photovoltaic/photoelectrochemical (PV/PEC) water-splitting device with a benchmark solar-to-hydrogen conversion efficiency of 5.2% under simulated air mass (AM) 1.5 illumination is reported. This cell consists of a gradient-doped tungsten–bismuth vanadate (W:BiVO4) photoanode and a thin-film silicon solar cell. The improvement with respect to an earlier cell that also used gradient-doped W:BiVO4 has been achieved by simultaneously introducing a textured substrate to enhance light trapping in the BiVO4 photoanode and further optimizing the W gradient doping profile in the photoanode. Various PV cells have been studied in combination with this BiVO4 photoanode, such as an amorphous silicon (a-Si:H) single junction, an a-Si:H/a-Si:H double junction, and an a-Si:H/nanocrystalline silicon (nc-Si:H) micromorph junction. The highest conversion efficiency, which is also the record efficiency for metal-oxide-based water-splitting devices, is reached for a tandem system consisting of the optimized W:BiVO4 photoanode and the micromorph (a-Si:H/nc-Si:H) cell. This record efficiency is attributed to the increased performance of the BiVO4 photoanode, which is the limiting factor in this hybrid PEC/PV device, as well as better spectral matching between BiVO4 and the nc-Si:H cell.
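To put the reported 5.2% figure in context, solar-to-hydrogen (STH) efficiency is conventionally computed from the operating photocurrent density as j × 1.23 V × Faradaic efficiency / incident power. The sketch below applies that standard relation. The incident power of 100 mW/cm² (the usual AM1.5G value) and the unity Faradaic efficiency are assumptions, since the abstract states neither, and the function names are hypothetical.

```python
# Thermodynamic potential for water splitting at standard conditions, in volts.
WATER_SPLITTING_POTENTIAL_V = 1.23


def sth_efficiency(current_ma_cm2, power_in_mw_cm2=100.0, faradaic_efficiency=1.0):
    """STH efficiency (fraction) from operating photocurrent density.

    The default incident power and Faradaic efficiency are assumed values.
    mA/cm^2 times V gives mW/cm^2, so the ratio is dimensionless.
    """
    chemical_power = current_ma_cm2 * WATER_SPLITTING_POTENTIAL_V * faradaic_efficiency
    return chemical_power / power_in_mw_cm2


def current_for_efficiency(efficiency, power_in_mw_cm2=100.0, faradaic_efficiency=1.0):
    """Photocurrent density (mA/cm^2) implied by a given STH efficiency."""
    return efficiency * power_in_mw_cm2 / (
        WATER_SPLITTING_POTENTIAL_V * faradaic_efficiency
    )
```

Under these assumptions, `current_for_efficiency(0.052)` gives an operating photocurrent density of roughly 4.2 mA/cm². This number is derived here for illustration and is not quoted from the abstract.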